Mining Association Rules from Structural Deltas of Historical XML Documents

Authors

  • Ling Chen
  • Sourav S. Bhowmick
  • Liang-Tien Chia
Abstract

Previous work on XML association rule mining focuses on mining from the data existing in XML documents at a certain point in time. However, due to the dynamic nature of online information, an XML document typically evolves over time. Knowledge obtained from mining the evolution of an XML document would be useful in a wide range of applications, such as XML indexing and XML clustering. In this paper, we propose to mine a novel type of association rule from a sequence of changes to XML structure, which we call the XML Structural Delta Association Rule (XSD-AR). We formulate the problem of XSD-AR mining by considering both the frequency and the degree of changes to XML structure. An algorithm derived from FP-growth, together with an optimization strategy, is developed for the problem. Preliminary experimental results show that our algorithm is efficient and scalable at discovering a complete set of XSD-ARs.
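As a rough illustration of the setting described in the abstract, the sketch below computes classical association-rule support and confidence over sets of changed substructures. It is not the paper's XSD-AR definition, which also weighs the degree of change and is not spelled out here. The sample data, the path names, and the functions `support` and `confidence` are all hypothetical.

```python
# Hypothetical input: one set per pair of consecutive document versions,
# holding the paths of substructures whose structure changed in that step.
deltas = [
    {"/site/people", "/site/items"},
    {"/site/people", "/site/items", "/site/regions"},
    {"/site/people"},
    {"/site/people", "/site/items"},
]


def support(itemset, transactions):
    """Fraction of deltas in which every substructure in itemset changed."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Classical confidence of the rule antecedent -> consequent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / base


if __name__ == "__main__":
    # How often do both substructures change together?
    print(support({"/site/people", "/site/items"}, deltas))
    # When /site/items changes, how often does /site/people change too?
    print(confidence({"/site/items"}, {"/site/people"}, deltas))
```

A rule such as "/site/items changes implies /site/people changes" would be reported only if both values pass user-chosen thresholds. An FP-growth style miner avoids enumerating candidate itemsets one by one, which this sketch does not attempt.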

Similar resources

XML structural delta mining: Issues and challenges

Recently, there have been increasing research efforts in XML data mining. These research efforts have largely assumed that XML documents are static. However, in reality, the documents are rarely static. In this paper, we propose a novel research problem called XML structural delta mining. The objective of XML structural delta mining is to discover knowledge by analyzing structural evolution pattern (als...

Mining tree-based association rules from XML documents

The increasing amount of XML datasets available to casual users increases the necessity of investigating techniques to extract knowledge from these data. Data mining is widely applied in the database research area in order to extract frequent correlations of values from both structured and semistructured datasets. In this work we describe an approach to mine Tree-based association rules from XM...

Mining Frequently Changing Substructures from Historical Unordered XML Documents

Recently, there have been increasing research efforts in XML data mining. These efforts have largely assumed that XML documents are static. However, in many real applications, XML data are evolutionary in nature. In this paper, we focus on mining evolution patterns from historical XML documents. Specifically, we propose a novel approach to discover frequently changing structures (FCS) from a sequence of...

Mining Association Rules from Xml Document

In this work we describe an approach to mine Tree-based association rules from XML documents. Such rules provide information on both the structure and the content of XML documents; moreover, they can be stored in XML format to be queried later on. The mined knowledge is approximate, intensional knowledge used to provide: (i) quick, approximate answers to queries and (ii) information about struc...

Deriving General Association Rules from XML Data

XML documents have become popular because the semi-structured nature of XML allows a wide variety of data to be represented in XML. Association rule mining is an important problem in the data mining domain. Currently, the problem of association rule mining on XML data has not been well studied. Existing work only addresses the problem of mining specific association rules from XML data. Such techn...

Publication date: 2004